Chapter 2 Images, Sampling and the Frequency Domain

CHAPTER2.MCD

Feature Extraction & Image Processing,

M. S. Nixon and A. S. Aguado

Elsevier/ Butterworth Heinmann, 2nd Edition 2007

This worksheet is the companion to Chapter 2. We will start with one dimensional signal analysis, move on to sampling theory and then study processing images in the frequency domain. The worksheet follows the text directly and allows you to process basic images.

Rather than start with images (which are two dimensional - spatial - signals) we'll start with one dimensional signals. Consider for example, a signal p(t) which is a function of time t:

This signal is a continuous function of time. We can transform the signal into the frequency domain, its frequency components, using the Fourier transform which is defined as:

where P(w) is the transform, p(t) is the original signal, t is time and w=2pf is frequency.

For a pulse of length

s with amplitude

and centred on the origin x(t) is

To plot it, we'll need a time variable:

So our pulse looks like:

To obtain its frequency domain version, by substitution in the definition for the Fourier transform we get

And Mathcad's symbolic manipulation (select the whole function by clicking on the assignment operator and then choose evaluate from the symbolic menu) gives the result

Luxury!

Then we can use Mathcad's simplification system (simplify from the symbolic menu) to give

We can't plot Fp(w) directly since it is undefined for w=0. To handle the singularities, we'll change Fp(w) to be

By L'Hôpital's rule

Mathcad's spell check can handle this!!!!

By the relation for Sin(q)

We'll need to plot a range of frequencies

So our Fourier transform looks like:

Try altering the width of the pulse. Do you expect what you see in the transform?

This is the sinc function, (sin(x))/x = sinc x. It suggests that a pulse is made up of a lot of low frequencies (the biggest frequency domain components are near the origin) and only few high frequency components. Try altering the width of the pulse, making it smaller and larger. Do you expect the result you see?

The Fourier transform tells us the frequency components in the original signal. The magnitude of the transform at a particular frequency indicates how much of that frequency is in the time domain signal. Let's have a look at some of the contributions. We'll look at the contribution of two of the low frequencies first, for w = 1 and w = 2

So there's more of the low frequency (w = 1) than there is of the high frequency (w = 2) as shown in the plot of the transform. Let's look at some higher frequencies, w = 3 and w = 4

So the contribution for w = 3 is less than that for w = 4. That is to be expected since there are points where the transform components are zero (reflecting no contribution). Finally, let's look at the contributions for w = 5 and w = 6

So the contribution for w = 5 exceeds that due to w = 6. Let's integrate the contributions from the frequency components to see what we get

Integrating the frequency components has taken us back to something which is very close to the original pulse. The difference is because there are more frequency components above w=6 and when we integrate them (by extending the integration limits) then we get a signal which is much closer to the original signal.

Note that the the result of the Fourier transform is actually a complex number. We often display/ interpret the Fourier transform in terms of its magnitude and phase where (for a complex number), the magnitude is given by

and the phase is

The magnitude tells us how much of each frequency is in the original signal, the phase tells us when. The magnitude of the transform is a pulse is similar to the previously depicted transform, but in positive form. The phase varies between 0 and 2p radians, consistent with the atan function.

Magnitude

Phase

The inverse Fourier transform takes us back from the transform to the time domain. It is used to reconstruct the original time domain signal from its transform components. The inverse FT is defined as

Naturally, the earlier reconstruction of the pulse was an approximation to the inverse FT.

In order to develop transform theory further, we shall need the delta function. This has a value of one at a particular time, here at time t and it is zero for all other values of time.

The Fourier transform of a pulse is a sinc function. These are known as a transform pair, since the Fourier transform of a sinc function is a pulse. We can generate Fourier transforms by using Mathcad's symbolic transform operator. We shall use this to develop the transform pairs shown here. Other transform pairs include:


i) cosine wave

Time Domain Frequency Domain

The magnitude of the Fourier transform shows the frequency of the original cosine wave.


ii) a Gaussian function with variance

Time domain

Frequency domain

The Fourier transform is also a Gaussian function, illustrating linearity in the transform process. Try altering the standard deviation, s, how does this affect the width (in the time and in the frequency domains)?

iii) a spike (delta function) centred on the origin:

The Fourier transform shows that adding up an infinite set of frequencies gives a delta function. Check this by making the width of the earlier pulse very small, what happens to its transform?

and iv) for a sequence of spikes we need a different version of our delta function which has a value of unity between a period when it is zero. The period controls the spacing of the delta functions:

Choosing a value for the period:

and a new range of frequencies as:

Try altering the width y, do you expect what you get?

The sequence of spikes is actually used to sample a signal. If we multiply a chosen signal by the series of spikes, then we will get a sequence of samples (or instantaneous values) of the signal we have sampled. Physically, these are the outputs of an Analogue to Digital converter which feeds samples of an analogue (continuous) signal into a digital computer. If we take these samples too slowly, then we won't be able to reconstruct the original signal which was sampled. If we take the samples very quickly, we'll generate lots of data to be analysed. Nyquist's sampling theorem states that:


In order for a signal to be reconstructed from its samples, it must be sampled

at at least twice the maximum signal in the original signal.


So for speech (say maximum frequency 6 kHz) we need to sample at 12 kHz. This process gives us discrete, sampled, signals as opposed to continuous ones. Accordingly, we need a Discrete Fourier Transform (DFT) to handle these sampled signals. The DFT transforms a set of samples of a function into a set of samples of frequency. So we'll need some pointers to the sample values:

For a window size:

(a set of 10 samples)

the pointer to sample values is:

and to frequency:

The discrete Fourier transform is then defined for a signal as:

where px and Fpu are the samples of a signal and the samples of frequency, respectively. For a pulse which is of amplitude A for the first 5 samples and zero for the remaining ones, the Fourier transform is:

The result of Mathcad's evaluation (again via evaluate on the symbolic menu) is:

which simplifies to

The range of frequencies excludes zero, since this gives a singularity in the computation of Fp0. So if we redefine the range of frequencies to include zero frequency, and define Fpu accordingly, we have:

for a range of frequencies

Which gives the discrete transform pair as:

Now try altering them width of the pulse. Do you expect what you see? Is the effect the same as for continuous signals?

This is an analytic version for a simple, known, relationship. The DFT is usually used direct, without resort to symbolic evaluation, since most signals of interest do not have an explicit analytical representation. So for a sampled signal defined by:

The discrete Fourier transform of the signal is given by calculation of:

Which gives the discrete transform pair as:

The components of the transform suggest how we can add up samples of frequency to return to the original sampled signal. The result of adding these up is shown below.

The first transform coefficient is:

The original signal is:

The second transform coefficient represents:

So adding the first and second coefficients:

Up to the third coefficient we obtain:

Finally, adding all 6 coefficients gives:

This takes us back to the original signal. There are some differences between the reconstructed signal and its original version, implying that some information has been lost. This process is, of course, the inverse DFT which calculates a signal from its transformed version, according to:

We'll now move to two dimensional signals: we need a image to process. Let's use some vertical lines on a dark background, this gives an image:

We'll usually view this as a matrix when we want to see what happened to the numbers, and as an image when we want to see effects. Every picture tells......

As a reminder, Indices to P are row, column. So for a pixel Py,x the index y is downwards, starting at 0, x is across. We can also display it as a picture

The 2D Fourier transform decomposes a picture into its constituent spatial frequencies. The result is a matrix of the same size as the original picture. So when transforming p we will get an 8*8 picture; the size N is 8. We then need pointers which address the eight points:

We can obtain a two dimensional discrete Fourier transform, by using a standard version of the formula, as:

Again, this gives a complex number, so we'll evaluate its magnitude as:

or

The latter expression uses Mathcad's vectorise operator which treats all matrix elements separately. As such, we form the modulus, or magnitude, of each individual transform pixel.

The picture of this is the magnitude spectrum. This shows that we have vertical bars only; these are horizontal spatial frequencies (since brightness varies across the picture).

and the image is:

We'll use the Mathcad Fourier transform (FT) function icfft to transform the image. icfft implements the earlier FT, but using a fast algorithm, called the Fast Fourier Transform (FFT). This requires a square picture 2n*2n (i.e. for 8*8, n=3).

and we obtain the transform of an image of horizontal bars by applying this function to a transposed version of the image (matrix) of vertical bars as:

And to look at it,

We have now transformed a picture of horizontal lines. This reflects vertical spatial frequencies (since brightness varies downwards and is constant across the picture). So the resulting transform should swap round, which it does.

For pointers to the co-ordinates in the original image, we need

and

The inverse 2D Fourier transform should take us back to the original picture, from the transformed version . It's defined as:

Let's apply it:

The one we've transformed back is the Fourier transform of the transposed original picture. So we've got back to the right place!

Wonderful, so let's use the Mathcad inverse Fourier transform icfft

This gives us our transform pair. Applying the inverse FT to the result of the FT should take us back to where we started from:

It does! So let's look at the Fourier transform of an image of a square on a dark background, with a bit of added noise. This gives a matrix:

and an image

Our Fourier transform is:

Now we see a more complex arrangement of spatial frequencies, both vertical and horizontal ones. When we add these together, we can get back to our original image of a square

We actually need to shift the Fourier transform around a bit. This is because it is conventionally displayed with the lower frequency components in the centre, and the higher frequency ones outside. The lowest frequency component is actually called the d.c. component (which is like the d.c. in electricity, which is a voltage which doesn't change). The d.c. component measures the total energy in a picture and is proportional to the sum of the pixel values. Spreading out from the (central) d.c. component, the frequency increases. To picture a transform in this way, we can rotate each quadrant by 1800.

Access all rows

Access all columns

rotate bottom left

et seq.

return image

Let's try it:

So the the components now appear to be centred on the middle of the picture, rather than on the corners. This is how the textbooks usually display transform data. This is only for display purposes. If you want to get back to the original picture, you just need to apply the operator again.

By using the shift property of the Fourier transform, we can specify the rearrangement process in more compact form as:

and this is the same as the previous picture, rearranged purely for visualisation, not changing the frequency domain information. The function will be used before using the Fourier transform, or after using the inverse transform. Let's have a look at a transform of a real image. We'll display it by taking the logarithm of each point: this is so that we can see the result (try taking the logarithm out to see its effect). First, we'll read the image in:

Now, let's look at its transform:

eye image

Fourier transform of eye image

Notice how a lot of the frequency information is centred near the d.c. component. This shows how transforms can be used to code images (you just use the frequency components which carry the information, rather than the ones which have none).

Let's have a look at some transform pairs:

i) the transform of a two dimensional pulse (a column?). First, we'll specify it as:

with Fourier transform

ii) for a two dimensional Gaussian function defined by:

One of the properties of the Fourier transform is called shift invariance. This states that magnitude of the Fourier transform of a shifted (delayed) signal is the same as that of the original (unshifted) signal. We can move a picture along the y axis (wrapping round the bits that fall off - called cyclic shift) by using a function:

The modulus function

ensures wraparound

So let's shift the eye:

and we can see the result in the pictures

Shifted

Original

So we'll take our Fourier transforms, and look at the magnitude spectra:

These look pretty similar. Let's look at the real difference

Nothing!

These are the same, hence proving that shift invariance is true. This can be important in, say, texture analysis. We now know that if we take a small picture of a large texture, so long as the texture is spatially homogeneous, then the Fourier magnitude spectrum won't change wherever we take the picture. One thing does change though, that's the phase. Now:

phase of shifted image

phase of original image

So it's the phase that changes, not the magnitude. This is how phase affects how the spatial frequencies are added up. Try it for other amounts of picture shift.

If you want to show this for vertical and horizontal shifts, shifts, you'll need to move the picture by the (general) operator

The Fourier transform also rotates with the image rotation. We'll just do rotation by 90o by using matrix transformation. The transform also rotates by 90o. This is to be expected since horizontal spatial frequencies have become vertical ones and vice versa.

The last property of major interest is scaling. The scaling of frequency samples is inversely proportional to the scaling of the image samples. Naturally, this is due to the inverse relationship between time and frequency. Scaling a picture down (by division) makes its frequencies higher, as expected.

Let's now look at some other transforms, first the Discrete Cosine Transform (DCT). This is defined as:

and let's apply it to our image of a square:

We need to be able to get back to the picture domain, so we need the inverse DCT which is defined as:

Applying this to a real image would take a long time. There are routines, some based on the FFT, which allow for fast implementation. The major advantage of the DCT is energy compaction: transform images (via the DCT) have fewer components than those obtained by the FFT. As such the DCT is the major component of image coding systems, like JPEG and MPEG. Another transform is the Hartley transform: this uses purely real arithmetic to transform from one image to another as:

Let's see 'im go...

One convenience of the Hartley transform is that the inverse and forward transforms are the same:

Finally, let's look at some introductory applications of the Fourier transform. The first application concerns filtering the image. We need a spatial filter, so we'll use a circle. This is centred on the d.c. component and you choose the radius to select frequency components. If we choose those components within a circle centred on the origin, we get low frequency components and delete the high frequency ones. This is called low-pass filtering. We need a radius first, then we low-pass filter the Fourier transform:

Then we keep those points within a circle of this radius, and set those outside it to zero.

So the bits of the picture we'll retain are:

and the result of low-pass filtering is:

region selected from transform

low-pass filtered eye

The low-pass filter actually has a blurring function, due to its exclusion of high frequency components. If you set the radius to a low value, you end up with a very blurred picture. If you set the radius to a high value, excluding none of the frequency components, you'll return to the original image. Check out different values of the radius: try (say) 2, 10 and 20.

So let's look at high-pass filtering, where the low frequency components are deleted and the high frequency ones are retained. Again, we'll use a circle so we'll need a radius and a high-pass filter:

So the components the high-pass filter has kept are:

and to see the result,

retained high frequencies

high-pass filtered image

The high-pass filter actually selects regions where the brightness changes sharply, since this gives rise to high frequency information. Accordingly, the result of the high-pass filter shows up the edges of eyes, and the corners in particular. Again, try different values for the radius and see (and explain) what you get.

Note that the rearrangement process is only used to visualise the transforms. Implementation code does not need to use the rearrangement process. As an exercise, what form would the low-pass and high-pass filters take if the rearrangement process was not used?

That's the end of images, sampling and frequency domain processing. There's a lot more material that could be included (there are many more transforms, for example), but isn't since it is more in the realms of image processing as opposed to computer vision. We will however continue to use Fourier theory, sometime because it helps us to understand techniques, sometimes because it allows us fast implementation and sometimes because we need spatial frequency concepts. So don't forget it all!